Cross-lingual Sentiment Lexicon Learning With Bilingual Word Graph Label Propagation

نویسندگان

  • Dehong Gao
  • Furu Wei
  • Wenjie Li
  • Xiaohua Liu
  • Ming Zhou
چکیده

In this article we address the task of cross-lingual sentiment lexicon learning, which aims to automatically generate sentiment lexicons for the target languages with available English sentiment lexicons. We formalize the task as a learning problem on a bilingual word graph, in which the intra-language relations among the words in the same language and the interlanguage relations among the words between different languages are properly represented. With the words in the English sentiment lexicon as seeds, we propose a bilingual word graph label propagation approach to induce sentiment polarities of the unlabeled words in the target language. Particularly, we show that both synonym and antonym word relations can be used to build the intra-language relation, and that the word alignment information derived from bilingual parallel sentences can be effectively leveraged to build the inter-language relation. The evaluation of Chinese sentiment lexicon learning shows that the proposed approach outperforms existing approaches in both precision and recall. Experiments conducted on the NTCIR data set further demonstrate the effectiveness of the learned sentiment lexicon in sentence-level sentiment classification.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cross-Lingual Sentiment Analysis Without (Good) Translation

Current approaches to cross-lingual sentiment analysis try to leverage the wealth of labeled English data using bilingual lexicons, bilingual vector space embeddings, or machine translation systems. Here we show that it is possible to use a single linear transformation, with as few as 2000 word pairs, to capture fine-grained sentiment relationships between words in a cross-lingual setting. We a...

متن کامل

Cross-Lingual Sentiment Classification with Bilingual Document Representation Learning

Cross-lingual sentiment classification aims to adapt the sentiment resource in a resource-rich language to a resource-poor language. In this study, we propose a representation learning approach which simultaneously learns vector representations for the texts in both the source and the target languages. Different from previous research which only gets bilingual word embedding, our Bilingual Docu...

متن کامل

Earth Mover's Distance Minimization for Unsupervised Bilingual Lexicon Induction

Cross-lingual natural language processing hinges on the premise that there exists invariance across languages. At the word level, researchers have identified such invariance in the word embedding semantic spaces of different languages. However, in order to connect the separate spaces, cross-lingual supervision encoded in parallel data is typically required. In this paper, we attempt to establis...

متن کامل

Semi-Supervised Affective Meaning Lexicon Expansion Using Semantic and Distributed Word Representations

In this paper, we propose an extension to graph-based sentiment lexicon induction methods by incorporating distributed and semantic word representations in building the similarity graph to expand a threedimensional sentiment lexicon. We also implemented and evaluated the label propagation using four different word representations and similarity metrics. Our comprehensive evaluation of the four ...

متن کامل

Adversarial Training for Unsupervised Bilingual Lexicon Induction

Word embeddings are well known to capture linguistic regularities of the language on which they are trained. Researchers also observe that these regularities can transfer across languages. However, previous endeavors to connect separate monolingual word embeddings typically require cross-lingual signals as supervision, either in the form of parallel corpus or seed lexicon. In this work, we show...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computational Linguistics

دوره 41  شماره 

صفحات  -

تاریخ انتشار 2015